Fundamental Limits of Distributed Caching in D2D Wireless Networks



Mingyue Ji, Giuseppe Caire and Andreas F. Molisch 

Department of Electrical Engineering 

University of Southern California 

Email: {mingyuej, caire, molisch}@usc.edu






Abstract — We consider a wireless Device-to-Device (D2D) net- 
work where communication is restricted to be single-hop, users 
make arbitrary requests from a finite library of possible files and 
user devices cache information in the form of linear combinations 
of packets from the files in the library (coded caching). We 
consider the combined effect of coding in the caching and 
delivery phases, achieving "coded multicast gain", and of spatial 
reuse due to local short-range D2D communication. Somewhat 
counterintuitively, we show that the coded multicast gain and the 
spatial reuse gain do not cumulate, in terms of the throughput 
scaling laws. In particular, the spatial reuse gain shown in 
our previous work on uncoded random caching and the coded 
multicast gain shown in this paper yield the same scaling laws 
behavior, but no further scaling law gain can be achieved by 
using both coded caching and D2D spatial reuse. 

I. Introduction 

Wireless traffic is dramatically increasing, mainly due to
on-demand video streaming [1]. One of the most promising
approaches for solving this problem is caching, i.e., storing
video files in the users' local caches and/or in dedicated helper
nodes disseminated in the network coverage area [2]-[5].
Capitalizing on the fact that user demands are highly redundant
(e.g., $n \approx 10000$ users in a university campus streaming
movies from a library of $m \approx 100$ popular titles), each
user demand can be satisfied through local communication
from a cache, without requiring a high-throughput backhaul
to the core network. Such backhaul would constitute a major
bottleneck, being too costly or (in the case of mobile helper
nodes) completely infeasible.

In particular, a one-hop Device-to-Device (D2D) communication
network with caching at the user nodes is studied in [4].
The network is formed by $n$ user nodes, each of which stores
$M$ files from a library of $m$ files. Under the simple protocol
model of [6], we showed that by using a well-designed random
caching policy and interference-avoidance transmission with
spatial reuse, such that links sufficiently separated in space
can be simultaneously active, as $n, m \to \infty$ with $n \gg m$
the throughput per user behaves as $\Theta(M/m)$ and the outage
probability, i.e., the probability that a user request cannot be
served, is negligible. Furthermore, this scaling is shown to be
order-optimal under the considered network model.¹

1 Notation: given two functions $f$ and $g$, we say that: 1) $f(n) = O(g(n))$
if there exist a constant $c$ and an integer $N$ such that $f(n) \leq c\,g(n)$ for $n > N$.
2) $f(n) = \Theta(g(n))$ if $f(n) = O(g(n))$ and $g(n) = O(f(n))$.



A different approach to caching is taken in [7], where a
system with a single transmitter (e.g., a cellular base station)
serving $n$ receivers (users) is considered. The user caches
again have a size of $M$ files. However, instead of caching individual
files or segments thereof, coded caching is used. Files are
divided into packets (sub-packetization), and carefully designed
linear combinations of such packets are cached. The
delivery phase consists of the multicast transmission of a
sequence of coded packets by the base station such that the
maximum required number of transmitted packets over any
arbitrary set of users' demands is minimized. The scheme of
[7] achieves a min-max number of transmissions given by
$n \left(1 - \frac{M}{m}\right) \frac{1}{1 + \frac{Mn}{m}}$.
This scheme is shown to be approximately
optimal, by developing a cut-set lower bound on the min-max
number of transmissions. Notice that for $n \gg m$, the
throughput scaling is again given by $\Theta(M/m)$.

Notice that a conventional system, serving each user demand
as an individual TCP/IP connection to some video server in
some CDN [8] placed in the core network, as currently
implemented, yields per-user throughput scaling $\Theta(1/n)$.
Instead, both the caching approaches of [4] and of [7] yield
$\Theta(M/m)$, which is a much better scaling for $n \gg m$, i.e., in
the regime of highly redundant demands, for which caching is
expected to be efficient. The D2D approach of [4] makes use
of the spatial reuse of D2D local communication, while the
approach of [7] makes use of coding. In terms of throughput
scaling laws, D2D spatial reuse and coded caching yield the
same gain over a conventional system. Furthermore, both the
achieved spatial reuse and the achieved coded caching gains
are shown to be optimal.

II. Overview of the Main Results 

A natural question at this point is whether any gain can be
obtained by combining spatial reuse and coded caching. In this
paper, we consider the same model of D2D wireless network
as [4], but we consider coded caching and delivery phases.
The main contributions of this paper are as follows: 1) if no
spatial reuse is possible (i.e., only one concurrent transmission
is allowed in the network), the proposed coded caching and
delivery scheme with sub-packetization achieves almost the
same throughput of [7], without the need of a base station;
2) when spatial reuse is possible, then for any combination of
spatial reuse and coded caching, the throughput has the same
scaling law (with possibly different constant) of the reuse-only
case [4] or the coded-only case [7]. Counterintuitively, this
means that it is not possible to cumulate the spatial reuse gain
and the coded caching gain, as far as the throughput scaling
law is concerned. It follows that the best combination of reuse
and coded caching gains must be sought in terms of the actual
throughput in bit/s/Hz (i.e., in the constants at large, but finite,
$n, m, M$), rather than in terms of scaling laws.

Fig. 1. a) Grid network with $n = 49$ nodes (black circles) with minimum
separation $s = 1/\sqrt{n}$. b) An example of single-cell layout and the interference
avoidance TDMA scheme. In this figure, each square represents a cluster. The
gray squares represent the concurrent transmitting clusters. The red area is the
disk where the protocol model allows no other concurrent transmission. $R$ is
the worst case transmission range and $\Delta$ is the interference parameter. We
assume a common $R$ for all the transmitter-receiver pairs. In this particular
example, the TDMA parameter is $K = 9$.



The paper is organized as follows. Section III presents the
network model and the formal problem definition. We illustrate
all the main results in Section IV and give some discussions
in Section VI. Due to the space limit, proofs and details are
omitted and can be found in [9].

III. Network Model and Problem Definition 

We consider a grid network formed by $n$ nodes $\mathcal{U} =
\{u_1, \ldots, u_n\}$ placed on a regular grid on the unit square,
with minimum distance $1/\sqrt{n}$ (see Fig. 1(a)). Users $u \in \mathcal{U}$
make arbitrary requests $f_u \in \mathcal{F} = \{f_1, \ldots, f_m\}$ from a fixed
file library of size $m$. The vector of requests is denoted by
$\mathbf{f} = (f_{u_1}, \ldots, f_{u_n})$. Communication between user nodes obeys
the following protocol model: if a node $i$ transmits a packet to
node $j$, then the transmission is successful if and only if: a)
the distance between $i$ and $j$ is less than $r$; b) any other node
$k$ transmitting simultaneously is at distance $d(k,j) \geq (1+\Delta) r$
from the receiver $j$, where $r, \Delta > 0$ are protocol parameters.
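As a rough illustration of the protocol model, the following Python sketch checks conditions a) and b) for a single link. The function name `transmission_succeeds`, the representation of nodes as (x, y) coordinates on the unit square, and the numeric values in the usage lines are assumptions made for this example only; they are not part of the model definition above.

```python
import math


def transmission_succeeds(tx, rx, other_txs, r, delta):
    """Check the protocol model for the link tx -> rx.

    tx, rx    : (x, y) coordinates of transmitter and receiver
    other_txs : coordinates of all other simultaneously transmitting nodes
    r, delta  : protocol parameters (range and interference parameter)
    """
    # a) the receiver must be at distance less than r from its transmitter
    if math.dist(tx, rx) >= r:
        return False
    # b) every other active transmitter must be at least (1 + delta) * r
    #    away from the receiver
    guard = (1 + delta) * r
    return all(math.dist(k, rx) >= guard for k in other_txs)


# Hypothetical usage: one interferer far from, then close to, the receiver.
far_ok = transmission_succeeds((0.0, 0.0), (0.1, 0.0), [(0.5, 0.0)], r=0.2, delta=0.5)
near_ok = transmission_succeeds((0.0, 0.0), (0.1, 0.0), [(0.3, 0.0)], r=0.2, delta=0.5)
```

In the first call the interferer lies outside the guard zone of radius $(1+\Delta) r$ around the receiver, so the link succeeds; in the second it lies inside, so the link fails.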

In practice, nodes send data at some constant rate $C_r$
bit/s/Hz, where $C_r$ is a non-increasing function of the
transmission range $r$ [3].

Fig. 2. Qualitative representation of our system assumptions: each user caches
an entire file, formed by an arbitrarily large number of chunks. Then, users
place random requests of finite sequences of chunks from files of the library,
of random duration and random initial points.

Unlike live streaming, in video on-demand, the probability
that two users wish to stream the same file at the
same time is essentially zero, although there is a large
redundancy in the demands when $n \gg m$. In order to model
the intrinsic asynchronism of video on-demand and forbid
any form of uncoded multicasting gain by overhearing "for
free" transmissions dedicated to other users, we assume that
each file in the library is formed by $L$ packets.² Then, we
assume each user downloads a randomly selected segment
of length $L'$ packets of the requested file, as qualitatively
shown in Fig. 2. According to our model, a demand vector
$\mathbf{f}$ is associated with a list of random pointers $\mathbf{s}$ with elements
$s_u \in \{1, \ldots, L - L' + 1\}$ such that for each $u$ demanding file $f_u$
the corresponding segment of packets $s_u, s_u + 1, \ldots, s_u + L' - 1$
is downloaded. Here, $\mathbf{s}$ is random i.i.d., and it is known as
side information by all nodes, and explicit dependency on $\mathbf{s}$ is
omitted for simplicity of notation. We let $W_f^j$ denote packet
$j$ from file $f \in \mathcal{F}$. We assume that each packet contains $B$
information bits, such that $\{W_f^j\}$ are i.i.d. random variables
uniformly distributed over $\{1, 2, 3, \ldots, 2^B\}$. We are interested
in the regime of fixed $L'$ and $L \to \infty$, such that the probability
that segments of different users overlap vanishes. Then, we have:

Definition 1: (Coded Cache Phase) The caching phase is
a map of the file library $\mathcal{F}$ onto the cache of the users in $\mathcal{U}$.
Each cache has size $M$ files. For each $u \in \mathcal{U}$, the function
$\phi_u : \mathbb{F}_2^{mBL} \to \mathbb{F}_2^{MBL}$ generates the cache content
$Z_u \triangleq \phi_u(W_f^j, f = 1, \ldots, m, j = 1, \ldots, L)$.

Definition 2: (Delivery Phase) Let $R_u^T$ denote the number
of bits needed to be transmitted by node $u$ to satisfy the request
vector $\mathbf{f}$. Then, we define the rate of node $u$ as $R_u = \frac{R_u^T}{BL'}$.
The function $\psi_u : \mathbb{F}_2^{MBL} \times \mathcal{F}^n \to \mathbb{F}_2^{R_u^T}$ generates the
transmitted message $X_{u,\mathbf{f}} \triangleq \psi_u(Z_u, \mathbf{f})$ of node $u$ as a function
of its cache content $Z_u$ and of the demand vector $\mathbf{f}$. We denote
the set of nodes whose transmitted information is useful at
node $u$ as $\mathcal{D}_u$. The function
$\lambda_u : \mathbb{F}_2^{BL' \sum_{i \in \mathcal{D}_u} R_i} \times \mathbb{F}_2^{MBL} \times \mathcal{F}^n \to \mathbb{F}_2^{BL'}$
decodes the request of user $u$ from all messages
received from the users in $\mathcal{D}_u$ and its own cache, i.e., we have

$$\hat{W}_{u,\mathbf{f}} \triangleq \lambda_u(\{X_{i,\mathbf{f}} : i \in \mathcal{D}_u\}, Z_u, \mathbf{f}). \tag{1}$$







2 This is compliant with current video streaming protocols such as DASH
[2], where the video file is split into segments which are sequentially
downloaded by the streaming users.



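The worst-case error probability is defined as

$$P_e = \max_{\mathbf{f} \in \mathcal{F}^n} \max_{u \in \mathcal{U}} \mathbb{P}\left(\hat{W}_{u,\mathbf{f}} \neq W_{u,\mathbf{f}}\right), \tag{2}$$

where $W_{u,\mathbf{f}}$ denotes the segment requested by user $u$ under the demand vector $\mathbf{f}$.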

Letting $R = \sum_{u \in \mathcal{U}} R_u$, the cache-rate pair $(M, R)$ is
achievable if $\forall \varepsilon > 0$ there exist a set of cache encoding functions
$\{\phi_u\}$, a set of delivery functions $\{\psi_u\}$ and a set of decoding
functions $\{\lambda_u\}$ such that $P_e < \varepsilon$. Then the optimal achievable
rate³ is given by

$$R^*(M) = \inf\{R : (M, R) \text{ is achievable}\}. \tag{3}$$

In order to relate the rate to the throughput of the network, 
defined later, we introduce the concept of transmission policy. 

Definition 3: (Transmission policy) The transmission policy
$\Pi_t$ is a rule to activate the D2D links in the network. Let
$\mathcal{L}$ denote the set of all directed links. Let $\mathcal{A} \subseteq 2^{\mathcal{L}}$ be the set of all
possible feasible subsets of links (this is a subset of the power
set of $\mathcal{L}$, formed by all sets of links forming independent sets in
the network interference graph induced by the protocol model).
Let $\mathsf{A} \in \mathcal{A}$ denote a feasible set of simultaneously active links.
Then, $\Pi_t$ is a conditional probability mass function over $\mathcal{A}$
given $\mathbf{f}$ (requests) and the coded caching functions, assigning
probability $\Pi_t(\mathsf{A})$ to $\mathsf{A} \in \mathcal{A}$.

In this work, we use a deterministic transmission policy,
which is a special case of the random policy defined above.
Suppose that for a given caching/delivery scheme $(M, R)$ is
achievable. Suppose also that for a given transmission policy
$\Pi_t$, the $RBL'$ coded bits to satisfy the worst-case demand
vector can be delivered in $t_s$ channel uses (i.e., it takes $t_s$
channel uses to deliver the required $BL'R_u$ coded bits to each
user $u \in \mathcal{U}$, where each channel use carries $C_r$ bits). Then,
the throughput per user measured in useful information bits
per channel use is given by

$$T \triangleq \frac{BL'}{t_s}. \tag{4}$$

The pair $(M, T)$ is achievable if $(M, R)$ is achievable,
and if there exists a transmission policy $\Pi_t$ such that the
$RBL'$ encoded bits can be delivered to their destinations in
$t_s \leq (BL')/T$ channel uses. Then, the optimal achievable
throughput is defined as

$$T^*(M) = \sup\{T : (M, T) \text{ is achievable}\}. \tag{5}$$



In the following we assume the necessary condition
$Mn \geq m$, so that any demand can be satisfied. Otherwise,
the file library cannot be entirely cached in the union of
the user caches, and some demands cannot be satisfied. With
random demands, such a setting can be handled by defining a
throughput versus outage probability tradeoff, as we did in [4].
However, random demands are not considered in this work.

We observe that our problem includes two parts: 1) the
design of the caching, delivery and decoding functions; 2)
scheduling concurrent transmissions in the D2D network. In
the analysis, for simplicity, we start by not considering the
scheduling problem and let the transmission range $r$ be such that
any node can be heard by all other nodes ($r \geq \sqrt{2}$). In this
case, only one simultaneous active link can be supported by the
network. Then, we will relax the constraint on the transmission
range $r$ and consider spatial reuse and scheduling.

3 As a matter of fact, this is the min-max number of packet transmissions
where min is over the caching/delivery scheme and max is over the demand
vectors, and thus intuitively is the inverse of the "rate" commonly used in
communications theory. We use the term "rate" in order to stay compliant
with the terminology introduced in [7].

IV. Main Results 

A. Transmission range $r \geq \sqrt{2}$

For $r \geq \sqrt{2}$, the actual spatial distribution of the users is irrelevant.
The following theorem yields the achievable rate obtained by
our proposed constructive coded caching and delivery scheme.



Theorem 1: For $r \geq \sqrt{2}$ and $t = \frac{Mn}{m} \in \mathbb{Z}^+$, the following
rate is achievable:

$$R(M) = \frac{m}{M}\left(1 - \frac{M}{m}\right). \tag{6}$$

Moreover, when $t$ is not an integer, the convex lower envelope
of $R(M)$ is achievable. □
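To make the achievable rate concrete, the short Python sketch below evaluates it with exact rational arithmetic and compares it with the base-station rate of [7] quoted in the Introduction. It assumes that (6) reads $R(M) = \frac{m}{M}(1 - \frac{M}{m})$, which agrees with the three-term product given in the Discussions section; the function names `d2d_rate` and `base_station_rate` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def d2d_rate(n, m, M):
    """Achievable D2D rate, assuming (6): R(M) = (m/M) * (1 - M/m).

    Only the case where t = M*n/m is a positive integer is handled;
    the convex lower envelope for non-integer t is not implemented here.
    """
    t = Fraction(M * n, m)
    if t.denominator != 1 or t < 1:
        raise ValueError("t = M*n/m must be a positive integer")
    return Fraction(m, M) * (1 - Fraction(M, m))


def base_station_rate(n, m, M):
    """Rate of the scheme of [7]: n * (1 - M/m) / (1 + M*n/m)."""
    return n * (1 - Fraction(M, m)) / (1 + Fraction(M * n, m))


# Parameters of the example in Section V: n = 3 users, m = 3 files, M = 2.
r_d2d = d2d_rate(3, 3, 2)           # Fraction(1, 2)
r_bs = base_station_rate(3, 3, 2)   # Fraction(1, 3)
loss = r_d2d / r_bs                 # Fraction(3, 2)
```

For these parameters the two rates are 1/2 and 1/3, matching the values and the relative loss of 3/2 reported in the example section.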

The caching and delivery scheme achieving (6) is given
in [9] and an illustrative example is given in Section V. The
corresponding achievable throughput is given by the following
immediate corollary.

Corollary 1: For $r \geq \sqrt{2}$, the throughput

$$T(M) = \frac{C_r}{R(M)}, \tag{7}$$

where $R(M)$ is given by (6), is achievable. □

Proof: In order to deliver $BL'R(M)$ coded bits without
reuse (at most one active link transmitting at any time) we need
$t_s = BL'R(M)/C_r$ channel uses. Therefore, (7) follows from
the definition (4). ■
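The one-line proof can be checked numerically. The sketch below is only an illustration: it assumes the throughput definition (4) reads $T = BL'/t_s$, and the helper names and the numbers used ($B = 1000$ bits, $L' = 4$ packets, $C_r = 2$) are made up for the example.

```python
from fractions import Fraction


def channel_uses(B, L_prime, R, C_r):
    """Channel uses needed to send B*L'*R coded bits at C_r bits per use,
    with a single active link at any time (no spatial reuse)."""
    return Fraction(B * L_prime) * R / C_r


def throughput(B, L_prime, t_s):
    """Per-user throughput in useful bits per channel use: B*L' / t_s."""
    return Fraction(B * L_prime) / t_s


# Hypothetical numbers: B = 1000, L' = 4, R(M) = 1/2, C_r = 2.
t_s = channel_uses(1000, 4, Fraction(1, 2), 2)
T = throughput(1000, 4, t_s)
```

With these values `t_s` is 1000 channel uses and `T` equals $C_r / R(M) = 4$, independently of $B$ and $L'$.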

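A lower bound (converse result) for the achievable rate in this
case is given by the following theorem.

Theorem 2: For $r \geq \sqrt{2}$, any achievable rate is lower
bounded by

$$R^*(M) \geq \max\left\{ \max_{s \in \{1, 2, \ldots, \min\{m, n\}\}} \left(s - \frac{s}{\lfloor m/s \rfloor} M\right),\ \frac{n}{n-1}\left(1 - \frac{M}{m}\right) \right\}. \tag{8}$$

□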



Given the fact that activating a single link per channel use
is the best possible feasible transmission policy, using the
lower bound (8) in lieu of $R(M)$ in (7) trivially yields
an upper bound on any achievable throughput. The
order optimality of our achievable rate is shown by:

Corollary 2: When $t = \frac{Mn}{m} \in \mathbb{Z}^+$, the ratio of the
achievable over the optimal rate is upper bounded by

$$\frac{R(M)}{R^*(M)} \leq \frac{1}{M(1-A)\left(1 - \frac{(1-A)M}{A}\right)}, \tag{9}$$

where $A = \sqrt{1 - \frac{1}{M+1}}$.
Obviously, the same quantity upper bounds the ratio $\frac{T^*(M)}{T(M)}$.

□
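The size of this gap can be evaluated directly. The following Python sketch assumes that the bound (9) reads $\left[M(1-A)\left(1 - \frac{(1-A)M}{A}\right)\right]^{-1}$ with $A = \sqrt{1 - \frac{1}{M+1}}$; this reading reproduces the two end values (5.83 for $M = 1$ and 4 for large $M$) quoted in the Discussions section. The function name `rate_gap_bound` is an assumption of this example.

```python
import math


def rate_gap_bound(M):
    """Upper bound on R(M)/R*(M), assuming the form of (9) stated above."""
    A = math.sqrt(1 - 1 / (M + 1))
    return 1 / (M * (1 - A) * (1 - (1 - A) * M / A))


# The bound for a few cache sizes; it decreases from about 5.83 toward 4.
gaps = {M: rate_gap_bound(M) for M in (1, 2, 3, 10, 100)}
```

For $M = 1$ the expression equals $3 + 2\sqrt{2} \approx 5.83$, and it approaches 4 as $M$ grows.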



B. Transmission range $r < \sqrt{2}$

In this case, the transmission range can be picked arbitrarily
in order to force D2D communication to be localized and
allow for some spatial reuse. We then also need to design
a transmission policy to schedule concurrent active links.
The proposed policy is based on clustering: the network is
divided into clusters of equal size $g_c$, independently of the
users' demands. Users can obtain the requested files only from
nodes in the same cluster. Therefore, each cluster is treated as
a small network. Assuming that $g_c M \geq m$, the total storage
capacity of each cluster is sufficient to store the whole file
library. Under this assumption, the same caching and delivery
scheme used to prove Theorem 1 can be used here. A simple
achievable transmission policy consists of partitioning the set
of clusters into $K$ subsets, such that the clusters of the same subset
do not interfere, activating simultaneously one link per cluster
in each subset, and using TDMA in order to avoid interference
between the clusters. This is a classical time-frequency reuse
scheme with reuse factor $K$ [10, Ch. 17], as shown in Fig. 1(b).
In particular, we can pick $K = \left(\lceil \sqrt{2}(1 + \Delta) \rceil + 1\right)^2$. This
scheme achieves the following throughput:

Theorem 3: Let $r$ be such that any two nodes in a "squarelet"
cluster of size $g_c$ can communicate, and let $t = \frac{M g_c}{m} \in \mathbb{Z}^+$.
Then, the throughput

$$T(M) = \frac{C_r}{K} \frac{1}{R(M)} \tag{10}$$

is achievable, where $R(M)$ is given by Theorem 1, $r$ is the
transmission range and $K$ is the TDMA parameter. Moreover,
when $t \notin \mathbb{Z}^+$, $T(M)$ can be computed by using the convex
lower envelope of $R(M)$. □
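The clustering policy can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the construction of [9]: the reuse factor is taken as $K = (\lceil \sqrt{2}(1+\Delta) \rceil + 1)^2$, the throughput as $C_r / (K R(M))$ with $R(M) = \frac{m}{M}(1 - \frac{M}{m})$, and the slot assignment `tdma_slot` is the usual periodic reuse pattern on a square grid of clusters, which is an assumption of this example. All function names are hypothetical.

```python
import math
from fractions import Fraction


def reuse_factor(delta):
    """TDMA reuse factor K = (ceil(sqrt(2) * (1 + delta)) + 1) ** 2."""
    return (math.ceil(math.sqrt(2) * (1 + delta)) + 1) ** 2


def tdma_slot(row, col, K):
    """Slot of the cluster at grid position (row, col).

    Assumed periodic pattern: clusters that share a slot are sqrt(K)
    cluster widths apart in each direction.
    """
    k = math.isqrt(K)
    return (row % k) * k + (col % k)


def clustered_throughput(C_r, K, m, M):
    """Throughput with reuse, assuming (10): C_r / (K * R(M))."""
    R = Fraction(m, M) * (1 - Fraction(M, m))
    return Fraction(C_r) / (K * R)


K = reuse_factor(0.2)                    # 9, as in the example of Fig. 1(b)
slots = {tdma_slot(r, c, K) for r in range(3) for c in range(3)}
T = clustered_throughput(1, K, 3, 2)     # Fraction(2, 9)
```

Since $R(M)$ does not depend on the number of nodes, the same per-cluster rate applies to every cluster, and the only cost of reuse in this expression is the factor $1/K$ together with the change from $C_{\sqrt{2}}$ to $C_r$.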

Notice that whether reuse is convenient or not in this context
depends on whether $C_{\sqrt{2}}$ (the link spectral efficiency for
communicating across the network) is larger or smaller than
$C_r/K$, for some smaller $r$ which determines the cluster size.
In turn, this depends on the dependency of the link spectral
efficiency on the communication range. This aspect is not
captured by the protocol model, and the answer may depend
on the operating frequency and appropriate channel model of
the underlying wireless network physical layer.

An upper bound on the throughput with reuse is given by:
Theorem 4: When $r < \sqrt{2}$ and the whole library is cached
within radius $r$ of any node, the throughput is upper bounded
by



T*(M) < 



ap 



A 2 



max, 



G{1,2, 



{m,\7rr 2 n^}} ( s ~ ^Jf] M ) 



(ID 



,mirH m, 7rr^n 



where $r$ is the transmission range and $\Delta$ is the interference
parameter. □

Similarly to the case of $r \geq \sqrt{2}$, we have the upper bound
on the optimal to achievable throughput ratio:



T*(M) 
T(M) 



*r&i 



M(l-A) (l 



where A 



M+l 



and for t 



Mg c 



(1-A)M\ ' 

A J 

GZ+. 



(12) 




Fig. 3. The augmented network when m = 3, n = 3. The three requested 
vectors are: (A, B, C), (B, C, A) and (C, A, B). 



V. An Example 

The proposed caching placement and delivery scheme is
illustrated through a simple example. Consider a network with
three users ($n = 3$). Each user can store $M = 2$ files, and the
library has size $m = 3$ files, which are denoted by $A, B, C$. Let
$r \geq \sqrt{2}$. Without loss of generality, we assume that each node
requests one packet of a file ($L' = 1$). We divide each packet
of each file into 6 sub-packets, and denote the sub-packets of
the $j$-th packet as $\{A_{j,\ell} : \ell = 1, \ldots, 6\}$, $\{B_{j,\ell} : \ell = 1, \ldots, 6\}$,
and $\{C_{j,\ell} : \ell = 1, \ldots, 6\}$. The size of each sub-packet is one sixth of the packet size.
We let user $u$ store $Z_u$, $u = 1, 2, 3$, given as follows:

$$Z_1 = (A_{j,1}, A_{j,2}, A_{j,3}, A_{j,4}, B_{j,1}, B_{j,2}, B_{j,3}, B_{j,4}, C_{j,1}, C_{j,2}, C_{j,3}, C_{j,4}), \quad j = 1, \ldots, L. \tag{13}$$

$$Z_2 = (A_{j,1}, A_{j,2}, A_{j,5}, A_{j,6}, B_{j,1}, B_{j,2}, B_{j,5}, B_{j,6}, C_{j,1}, C_{j,2}, C_{j,5}, C_{j,6}), \quad j = 1, \ldots, L. \tag{14}$$

$$Z_3 = (A_{j,3}, A_{j,4}, A_{j,5}, A_{j,6}, B_{j,3}, B_{j,4}, B_{j,5}, B_{j,6}, C_{j,3}, C_{j,4}, C_{j,5}, C_{j,6}), \quad j = 1, \ldots, L. \tag{15}$$

In this example, we consider the demand $\mathbf{f} = (A, B, C)$ with
initial point in the requested segment $\mathbf{s} = (1, 2, 3)$, i.e., user
1 requests packet 1 of file $A$, user 2 requests packet 2 of file
$B$ and user 3 requests packet 3 of file $C$. Then, the delivery
scheme is the following. User 1 transmits $B_{2,3} + C_{3,1}$. User
2 transmits $A_{1,5} + C_{3,2}$. User 3 transmits $A_{1,6} + B_{2,4}$. Thus,
$R_1 + R_2 + R_3 = \frac{1}{6} \cdot 3 = \frac{1}{2}$.
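The example can be replayed in code. The Python sketch below is a minimal simulation, assuming that the sum of sub-packets is taken over $\mathbb{F}_2$ (bitwise XOR) and that the caches hold sub-packets {1,2,3,4}, {1,2,5,6} and {3,4,5,6} for users 1, 2 and 3, respectively. The sub-packet size of 4 bytes, the random payloads and all identifiers are choices made for this illustration only.

```python
import secrets
from fractions import Fraction

SUBPACKET_BYTES = 4  # illustrative size, not specified in the paper
FILES = "ABC"
# CACHE[u]: sub-packet indices stored by user u for every packet of every file
CACHE = {1: {1, 2, 3, 4}, 2: {1, 2, 5, 6}, 3: {3, 4, 5, 6}}

# library[(file, packet, l)]: random payload of sub-packet l
library = {(f, j, l): secrets.token_bytes(SUBPACKET_BYTES)
           for f in FILES for j in (1, 2, 3) for l in range(1, 7)}

demand = {1: ("A", 1), 2: ("B", 2), 3: ("C", 3)}  # user -> (file, packet)

# Coded messages of the example: sender -> the two sub-packets it XORs
transmissions = {
    1: (("B", 2, 3), ("C", 3, 1)),
    2: (("A", 1, 5), ("C", 3, 2)),
    3: (("A", 1, 6), ("B", 2, 4)),
}


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def cached(u, key):
    return key[2] in CACHE[u]


def run_delivery():
    """Return, per user, the sub-packets of its request known after delivery."""
    known = {u: {l: library[(f, j, l)] for l in CACHE[u]}
             for u, (f, j) in demand.items()}
    for sender, (p, q) in transmissions.items():
        assert cached(sender, p) and cached(sender, q)  # sender holds both terms
        coded = xor(library[p], library[q])
        for u in demand:
            if u == sender:
                continue
            for wanted, side in ((p, q), (q, p)):
                # u decodes if it wants one term and has the other in its cache
                if wanted[:2] == demand[u] and cached(u, side):
                    known[u][wanted[2]] = xor(coded, library[side])
    return known


recovered = run_delivery()
total_rate = Fraction(len(transmissions), 6)  # three messages of 1/6 packet each
```

After the three transmissions every user holds all six sub-packets of its requested packet, and the total rate is 1/2.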

Next, we illustrate the idea of the general rate
lower bound of Theorem 2. Without loss of generality, we
assume that $L/L'$ is an integer and let $s$ denote the segment
index. For any scheme that satisfies arbitrary demands $\mathbf{f}$, with
arbitrary segments $\mathbf{s}$, we denote by $R^T_{u,s,\mathbf{f}}$ the number of
bits transmitted by user $u$ for segment $s$ when the
request vector is $\mathbf{f}$. Since the requests are arbitrary, we can
consider a time extension for all possible request vectors.
For example, we let the first request be $\mathbf{f} = (A, B, C)$, the
second request be $\mathbf{f} = (B, C, A)$ and the third request be
$\mathbf{f} = (C, A, B)$. Then, the augmented time-extended graph is
shown in Fig. 3. Considering user 3, from the cut that separates
$(X_{1,(A,B,C)}, X_{2,(A,B,C)}, X_{1,(B,C,A)}, X_{2,(B,C,A)}, X_{1,(C,A,B)}, X_{2,(C,A,B)}, Z_3)$
and $(W_C, W_A, W_B)$, we can obtain that

$$\sum_{s=1}^{L/L'} \left( R^T_{1,s,(A,B,C)} + R^T_{2,s,(A,B,C)} + R^T_{1,s,(B,C,A)} + R^T_{2,s,(B,C,A)} + R^T_{1,s,(C,A,B)} + R^T_{2,s,(C,A,B)} \right) + MBL \geq 3BL' \cdot L/L'. \tag{16}$$

Similarly, from the cut that separates
$(X_{1,(A,B,C)}, X_{3,(A,B,C)}, X_{1,(B,C,A)}, X_{3,(B,C,A)}, X_{1,(C,A,B)}, X_{3,(C,A,B)}, Z_2)$ and
$(W_B, W_C, W_A)$, and from the cut that separates
$(X_{2,(A,B,C)}, X_{3,(A,B,C)}, X_{2,(B,C,A)}, X_{3,(B,C,A)}, X_{2,(C,A,B)}, X_{3,(C,A,B)}, Z_1)$
and $(W_A, W_B, W_C)$, we can obtain similar formulas. By
summing (16) and the other two corresponding formulas and
dividing all terms by 2, we obtain

$$\sum_{s=1}^{L/L'} \left( R^T_{1,s,(A,B,C)} + R^T_{2,s,(A,B,C)} + R^T_{3,s,(A,B,C)} + R^T_{1,s,(B,C,A)} + R^T_{2,s,(B,C,A)} + R^T_{3,s,(B,C,A)} + R^T_{1,s,(C,A,B)} + R^T_{2,s,(C,A,B)} + R^T_{3,s,(C,A,B)} \right) + \frac{3}{2} MBL \geq \frac{9}{2} BL. \tag{17}$$

Noticing that, by symmetry, $R^T = R^T_{1,s,\mathbf{f}} + R^T_{2,s,\mathbf{f}} + R^T_{3,s,\mathbf{f}}$ for
any $s$ and $\mathbf{f}$, we have

$$\frac{3L}{L'} R^T \geq \frac{9}{2} BL - \frac{3}{2} MBL. \tag{18}$$

Dividing both sides by $3BL$, we obtain that any achievable
coding scheme must satisfy

$$R(M) = \frac{R^T}{BL'} \geq \frac{3}{2} - \frac{1}{2} M. \tag{19}$$

In the example of this section, for $M = 2$ we obtain $R^*(2) \geq
\frac{1}{2}$. Therefore, in this case the achievability scheme is optimal.
Considering the same network of $n = 3$ users with storage
capacity $M = 2$ and library size $m = 3$, but adding a special
node (base station) with all files available, the coded caching
scheme of [7] achieves $R(2) = \frac{1}{3}$. Then, in this case, the
relative loss incurred by not having a base station with all
files available is 3/2.

VI. Discussions

The achievable rate of Theorem 1 can be written as the
product of three terms, $R(M) = n \left(1 - \frac{M}{m}\right) \frac{m}{Mn}$, with the
following interpretation: $n$ is the number of transmissions by
using a conventional scheme that serves individual demands
without exploiting the demand redundancy; $\left(1 - \frac{M}{m}\right)$ can be
viewed as the local caching gain: any user can cache a
fraction $M/m$ of any file, therefore it needs to receive only
the remaining part; $\frac{m}{Mn}$ is the global caching gain, i.e., the
ability of coded caching and delivery to turn the individual
but overlapping demands into a coded multicast, such that
transmissions are useful to many users even though the streaming
sessions are strongly asynchronous. These three terms with the
same interpretation can be found also in the rate expression of
the scheme in [7] (see Section I), where the base station has
access to all the files. Comparing this rate with our Theorem 1,
we notice that they differ only in the last term (global caching
gain), which in the base station case is given by $\left(1 + \frac{nM}{m}\right)^{-1}$.
For $nM \gg m$, we notice that these factors are essentially
identical.

The gap bound (9) shows that, when $M$ is a
constant and $m \gg 1$, the achievable rate achieves the same
order of the converse. The multiplicative gap between the
achievable rate and the converse lower bound is a decreasing
function of $M$ between 5.83 (for $M = 1$) and 4 (for $M$
asymptotically large).

As already noticed, Theorem 3 shows that there is no
fundamental cumulative gain by using both spatial reuse and
coded caching. Under our assumptions, spatial reuse may or
may not be convenient depending on whether $\frac{C_r}{K}$ is larger or
smaller than $C_{\sqrt{2}}$. A closer look reveals a more subtle tradeoff.
Without any spatial reuse, the length of the codewords for each
user, related to the size of the sub-packetization, is $\binom{n}{Mn/m}$.
This may be very large when $n$ and $M$ are large. At the
other extreme, we have the case where the cluster size is the
minimum able to cache the whole library in each cluster. In
this case, we can just store $M$ different whole files into each
node, such that all $m$ files are present in each cluster, and
for the delivery phase we just serve whole files without any
coding as in [4]. In this case, the achieved throughput is $\frac{C_r}{K} \frac{M}{m}$
bits/s/Hz, which is almost as good as our proposed scheme
($\frac{C_r}{K} \left(\frac{m}{M} - 1\right)^{-1}$). This simple scheme is a special case of the general
setting treated in this paper, where spatial reuse is maximized
and codewords have length 1. If we wish to use the achievable
scheme of this paper, the codeword length is $\binom{g_c}{M g_c/m}$. Hence,
spatial reuse yields a reduction in the codeword length of the
corresponding coded caching scheme.
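The closing comparison between the uncoded scheme with maximal reuse and the coded scheme with reuse can be checked with a few lines of Python. The sketch assumes the two throughput expressions read $\frac{C_r}{K}\frac{M}{m}$ (uncoded, whole files) and $\frac{C_r}{K}\left(\frac{m}{M} - 1\right)^{-1}$ (coded); the function names and the sample parameters are hypothetical.

```python
from fractions import Fraction


def uncoded_reuse_throughput(C_r, K, m, M):
    """Whole-file caching with maximal spatial reuse: (C_r / K) * (M / m)."""
    return Fraction(C_r) / K * Fraction(M, m)


def coded_reuse_throughput(C_r, K, m, M):
    """Coded caching inside each cluster: (C_r / K) / (m / M - 1)."""
    return Fraction(C_r) / K / (Fraction(m, M) - 1)


# Hypothetical parameters: C_r = 1, K = 9, m = 100 files, M = 2 files per cache.
ratio = (coded_reuse_throughput(1, 9, 100, 2)
         / uncoded_reuse_throughput(1, 9, 100, 2))  # m / (m - M) = 50/49
```

The ratio equals $m/(m - M)$, which tends to 1 when $m \gg M$, in line with the statement that the simple uncoded scheme is almost as good in throughput while using codewords of length 1.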